THE CODE IS COMMENTED OUT ONLY SO THAT WE CAN KNIT FASTER!! (with cache = TRUE the code is not re-run, but the markdown still has to compile the graphs, so it takes time on every knit…)
names(cred) <- tolower(names(cred)) # convert the variable names to lowercase
cred <- cred[, -1] # drop the first column, which is useless for the analysis
sum(is.na(cred)) # count the missing values
To discuss: - method used - main steps - classification task (goal) =/= prediction - [to be completed…]
–> Data exploration = 1st insight –> Modelling: train/test set + selection of the “best” tree (pruning)/SVM (tuning)/NNET (number of layers)/logistic regression/K-NN (choosing K)/etc. + detailed presentation of all of this –> Cross-validation with the best of each model to select the overall winner…
–> Description of the variables (that way it is easier to find our way around?!)
Before beginning any kind of analysis, we have to understand the data we are working with.
data.frame(variable = names(cred),
classe = sapply(cred, typeof),
first_values = sapply(cred,
function(x) paste0(head(x), collapse = ", ")),
row.names = NULL) %>%
kable(caption="Overview of our data") %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = F) %>%
column_spec(1, width = "10em", border_right = T) %>%
column_spec(2, width = "6em") %>%
column_spec(3, width = "18em") %>%
scroll_box(width = "65%", height = "250px")
| variable | classe | first_values |
|---|---|---|
| chk_acct | integer | 0, 1, 3, 0, 0, 3 |
| duration | integer | 6, 48, 12, 42, 24, 36 |
| history | integer | 4, 2, 4, 2, 3, 2 |
| new_car | integer | 0, 0, 0, 0, 1, 0 |
| used_car | integer | 0, 0, 0, 0, 0, 0 |
| furniture | integer | 0, 0, 0, 1, 0, 0 |
| radio.tv | integer | 1, 1, 0, 0, 0, 0 |
| education | integer | 0, 0, 1, 0, 0, 1 |
| retraining | integer | 0, 0, 0, 0, 0, 0 |
| amount | integer | 1169, 5951, 2096, 7882, 4870, 9055 |
| sav_acct | integer | 4, 0, 0, 0, 0, 4 |
| employment | integer | 4, 2, 3, 3, 2, 2 |
| install_rate | integer | 4, 2, 2, 2, 3, 2 |
| male_div | integer | 0, 0, 0, 0, 0, 0 |
| male_single | integer | 1, 0, 1, 1, 1, 1 |
| male_mar_or_wid | integer | 0, 0, 0, 0, 0, 0 |
| co.applicant | integer | 0, 0, 0, 0, 0, 0 |
| guarantor | integer | 0, 0, 0, 1, 0, 0 |
| present_resident | integer | 4, 2, 3, 4, 4, 4 |
| real_estate | integer | 1, 1, 1, 0, 0, 0 |
| prop_unkn_none | integer | 0, 0, 0, 0, 1, 1 |
| age | integer | 67, 22, 49, 45, 53, 35 |
| other_install | integer | 0, 0, 0, 0, 0, 0 |
| rent | integer | 0, 0, 0, 0, 0, 0 |
| own_res | integer | 1, 1, 1, 0, 0, 0 |
| num_credits | integer | 2, 1, 1, 1, 2, 1 |
| job | integer | 2, 2, 1, 2, 2, 1 |
| num_dependents | integer | 1, 1, 2, 2, 2, 2 |
| telephone | integer | 1, 0, 0, 0, 0, 1 |
| foreign | integer | 0, 0, 0, 0, 0, 0 |
| response | integer | 1, 0, 1, 1, 0, 1 |
As we can see, the data are consistent with the information that “the client” provided. Most of the variables are binary or categorical, while only a few are numerical.
Beyond the first values of our variables and their types, we also need to understand how they are distributed and how they relate to each other. A correlation plot shows the correlation between each pair of variables and, more importantly, the correlation of each variable with the response variable we are interested in.
We see, for instance, that variables like chk_acct, duration, history, sav_acct or rent are among the most correlated (positively or negatively) with our outcome variable and are therefore likely to influence it in the models that we are going to build. Others, like present_resident or retraining, should have a low impact.
# Warning! Only the continuous variables should be plotted here. The categorical ones will be handled with the chi-squared test!
# Wilcoxon test to check whether the groups differ (as I understand it: to see whether a continuous variable has an influence on the outcome groups).
# Then run independence tests on the discrete variables to see whether they have an influence on the response.
plot_correlation(cred, type = "all", title = "Correlation Graph")
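The tests mentioned in the comment on the correlation plot (chi-squared for the categorical variables, Wilcoxon for the continuous ones) could be sketched as follows. This is only an illustration on simulated data: the data frame `toy` and its values are hypothetical stand-ins for `cred`, and only the base R functions `chisq.test` and `wilcox.test` are used.

```r
set.seed(1)
n <- 200

# Hypothetical stand-in for cred: a binary response, one categorical
# predictor and one continuous predictor (simulated values, not the real data)
toy <- data.frame(response = rbinom(n, 1, 0.7))
toy$chk_acct <- sample(0:3, n, replace = TRUE)
toy$duration <- round(rnorm(n, mean = ifelse(toy$response == 1, 18, 24), sd = 8))

# Categorical predictor vs response: chi-squared test of independence
chi <- chisq.test(table(toy$chk_acct, toy$response))

# Continuous predictor vs response: Wilcoxon rank-sum test between the two groups
# (exact = FALSE because rounded values contain ties)
wil <- wilcox.test(duration ~ response, data = toy, exact = FALSE)

c(chi_squared = chi$p.value, wilcoxon = wil$p.value)
```

A small p-value would suggest that the variable is linked to the response; on the real data the same two calls would simply be applied to the columns of `cred`.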
In addition, we can look at the summary of each variable. The summary statistics and the frequency table of history are presented below as an example:
| Statistic | Value |
|---|---|
| Min. | 0.000 |
| 1st Qu. | 2.000 |
| Median | 2.000 |
| Mean | 2.545 |
| 3rd Qu. | 4.000 |
| Max. | 4.000 |
| Values | Frequency |
|---|---|
| 0 | 40 |
| 1 | 49 |
| 2 | 530 |
| 3 | 88 |
| 4 | 293 |
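Such a summary can be obtained with base R. The sketch below rebuilds a vector from the frequencies shown in the table above (an assumption made for illustration; in the report the values come directly from `cred$history`) and computes both the summary statistics and the frequency table.

```r
# Rebuild the history column from the frequency table (1000 observations)
history <- rep(0:4, times = c(40, 49, 530, 88, 293))

summary(history)            # Min., quartiles, median, mean and Max.
table(history)              # absolute frequencies
prop.table(table(history))  # relative frequencies
```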
However, presenting such a summary for every variable would be long and tedious. It is better to represent these numbers visually. A boxplot shows all the important values of numerical data, while a barplot gives a clear picture of categorical data. Let’s look at the following graphs:
library(ggplot2)
library(stringr)
# Barplot for the variables whose values are all below 5 (binary or categorical),
# boxplot for the others; the last column (response) is skipped.
# my_theme() is the custom theme of this report, defined elsewhere.
for (i in seq_len(ncol(cred) - 1)) {
  var <- names(cred)[i]
  if (all(cred[[i]] < 5)) {
    print(
      ggplot(cred, aes(x = .data[[var]])) +
        geom_bar(stat = "count", position = "dodge") +
        ggtitle(str_c("Barplot of\n", var)) +
        xlab(var) +
        ylab("Total") +
        my_theme()
    )
  } else {
    print(
      ggplot(cred, aes(y = .data[[var]])) +
        geom_boxplot() +
        ylab(var) +
        ggtitle(str_c("Boxplot of\n", var)) +
        my_theme() +
        theme(axis.text.x = element_blank())
    )
  }
}
Thanks to these graphs, we can better understand our data at a glance and will be able to refer to them when needed.
In addition, these graphs enable us to see that some data are not tidy. For instance, education should be a binary variable. However, the barplot of this variable shows an observation recorded as \(-1\). We have the same problem with the binary variable guarantor, where a value of \(2\) is present.
We can also strongly suspect that the variable age contains wrongly recorded data, as we can see an outlier with a value much larger than 100.
We will have to confirm these first assumptions and correct these dirty data in an appropriate way.
Let’s first look at our variable age. We assume that, generally, a person will not live more than a hundred years, and will not take out a credit at such an age. This is why the observations with \(age > 100\) are most likely wrongly recorded. We will therefore have to replace them in our database.
First, we have to find how many observations are potentially dirty according to our assumptions and to locate them in order to replace them.
| Var1 | Freq |
|---|---|
| FALSE | 999 |
| TRUE | 1 |
| x |
|---|
| 537 |
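The two tables above can be obtained with a logical test and `which()`. Here is a minimal sketch on a short hypothetical vector that stands in for `cred$age` (the values are made up for illustration):

```r
# Hypothetical stand-in for cred$age with one implausible value
age <- c(67, 22, 49, 125, 53, 35)

table(age > 100)         # how many values break the rule age <= 100
idx <- which(age > 100)  # position(s) of the suspicious value(s)
idx
age[idx]                 # the suspicious value(s) themselves
```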
According to our results, we have one observation with \(age > 100\) that has to be replaced. It is instance 537 and its value is 125.
We can consider different options to replace this value. The first one could be to replace it by a value drawn at random within the observed range (between 19 and 75, 75 being the second highest value after \(125\)).
However, according to the following histogram, the distribution of age (without the erroneous value) is skewed, with a concentration around small values (which is logical, as young people generally have less money than older people and are therefore more likely to ask for credit).
It would therefore be possible to replace it at random, with probabilities proportional to the size of each class.
We prefer to use the median (equal to 33) to replace our problematic value, as it is simpler.
Note that the problematic value should not be used when calculating the median.
cred$age[cred$age > 100] <- median(cred$age[cred$age <= 100])
An alternative could have been to use the mean but, as we have no really big outlier left, both values would have been close to each other (\(mean = 35.596\) while \(median = 33\)).
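The replacement step can be illustrated on a small example. The vector below is hypothetical (it is not the real age column); it only shows that the median and the mean are computed on the valid values before the flagged value is overwritten.

```r
# Hypothetical ages; 125 plays the role of the wrongly recorded value
age <- c(19, 22, 27, 33, 35, 49, 75, 125)
valid <- age <= 100                  # rule used to flag plausible ages

median_valid <- median(age[valid])   # computed without the value 125
mean_valid   <- mean(age[valid])     # alternative replacement value

age[!valid] <- median_valid          # replace the flagged value by the median
age
```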
Next, we also have to deal with our two categorical variables that contain wrongly recorded values:
- one in education
- one in guarantor
They also have to be cleaned.
The following is again the barplot of education.
Here, the likelihood that this wrongly recorded value should be \(0\) is clearly higher. Therefore, each of the previously presented methods (using the mean, using the median and even assigning it to a class at random) would, with a high probability, result in assigning this instance the value \(Education = 0\).
We can confirm this first assumption with a frequency table:
| | -1 | 0 | 1 |
|---|---|---|---|
| sample size | 1 | 950 | 49 |
| proportion | 0.1 % | 95 % | 4.9 % |
It is indeed more appropriate to replace our value by 0 as the probability of belonging to this class is close to 20 times bigger.
cred$education[cred$education == -1] <- 0
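The frequency table and the recoding of education can be sketched as follows. The vector is rebuilt from the counts reported above, which is an assumption made for illustration (in the report the values come from `cred$education`).

```r
# Rebuild the education column from the reported counts (1, 950 and 49)
education <- rep(c(-1, 0, 1), times = c(1, 950, 49))

counts <- table(education)
props  <- round(100 * prop.table(counts), 1)   # percentages
rbind("sample size" = counts, "proportion (%)" = props)

# Recode the invalid value -1 to the majority class 0
education[education == -1] <- 0
table(education)
```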
Concerning the variable guarantor, we can look at the frequency table and plot the barplot as well:
| | 0 | 1 | 2 |
|---|---|---|---|
| sample size | 948 | 51 | 1 |
| proportion | 94.8 % | 5.1 % | 0.1 % |
Again, for the same reasons, it is preferable to replace the wrongly recorded value by 0.
cred$guarantor[cred$guarantor == 2] <- 0
In addition, the variable present_resident is also problematic, as it doesn’t have the same range as the other categorical variables. Its range goes from 1 to 4 whereas it should go from 0 to 3 like the other ones. We can modify its values in order to have the same format everywhere.
cred$present_resident <- cred$present_resident - 1
These first steps have enabled us to better understand our explanatory variables and to clean the problematic ones.
We now have to focus in detail on the response variable, on which the predictions will be made.
As our final goal is to predict whether a customer should be classified as a risky one or not, we take a particular look at our response variable, which establishes whether an applicant presents a good or a bad risk.
Let’s first have a look at its distribution:
| | 0 | 1 |
|---|---|---|
| sample size | 300 | 700 |
| proportion | 30 % | 70 % |
As we can see, if a random customer walks into the bank, the a priori probability that he presents a good risk is 70%.
Without any calculation, the bank is thus more likely than not to make a good decision when granting a credit.
However, the consequences can be really dramatic if 30% of the credits that the bank grants are not fully reimbursed. That’s why we have to develop a model that improves on this initial accuracy, which is obtained with the naive method of always granting a credit.
Ideally, this model should also minimise the number of credits that are predicted as “good” but are actually “bad”, as the consequences for the bank (the customer not reimbursing the credit) can be much more dramatic in this situation than when a “good” credit is predicted as “bad”.
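To make this asymmetry concrete, one can compute the accuracy of the naive rule and attach a cost to each type of error. The sketch below uses the class counts reported above; the two cost values and the helper `total_cost()` are purely hypothetical and only illustrate the idea.

```r
# Class counts of the response reported above: 300 "bad" (0) and 700 "good" (1)
n_bad  <- 300
n_good <- 700

# Naive rule: always grant the credit, i.e. always predict "good"
baseline_accuracy <- n_good / (n_bad + n_good)

# Hypothetical costs (assumption): accepting a bad credit is 5 times as
# costly as refusing a good one
cost_false_good <- 5
cost_false_bad  <- 1

# Total cost of a classifier given its two kinds of errors
total_cost <- function(false_good, false_bad) {
  false_good * cost_false_good + false_bad * cost_false_bad
}

baseline_accuracy                              # accuracy of the naive rule
total_cost(false_good = n_bad, false_bad = 0)  # cost of always predicting "good"
```

With such a cost function, two models with the same accuracy can be ranked differently depending on which errors they make.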
We will get to this later. First, after having presented each variable, it could be interesting to see if we can already have some assumptions concerning the relations between the explanatory variables and the response variable.
Text…
–> Explain that we can already form a first a priori idea of which variables will have an impact
** BOXPLOTS ARE NOT GREAT FOR BINARY VARIABLES ??!!**
Before beginning to work on our different models, we create a training set and a test set. The models are built on the training set and evaluated on the test set; this avoids rewarding overfitting, since we never predict instances that were used while building the model.
In order to evaluate the performance of the different models, we will use the exact same training and test sets for each model, to be sure that the performance differences result from the model we use and not from the randomness of splitting the data differently.
A Cross-Validation will be performed at the end in order to compare the different models and to choose which one should be used to make good predictions.
# Creation of testing and training sets
set.seed(1234)
index.train <- sample(1:nrow(cred), size = nrow(cred) * 0.7, replace = FALSE)
cred.train <- cred[index.train,]
cred.test <- cred[-index.train,]
In addition, to evaluate our models, we will use the accuracy as the main measure of performance. However, computing the accuracy by hand for each model can be quite long… We prefer to build a function that retrieves it at any time from a confusion matrix:
# Creation of a function to retrieve the accuracy from a confusion matrix
accuracy <- function(c) {
  # correctly classified instances (the diagonal) divided by all instances
  sum(diag(c)) / sum(c)
}
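As a quick check of this function, it can be applied to a small confusion matrix. The counts below are those of the CART confusion matrix reported further down in this report; the matrix is typed in by hand here only to keep the example self-contained.

```r
# Share of the counts lying on the diagonal of a confusion matrix
accuracy <- function(c) {
  sum(diag(c)) / sum(c)
}

# Counts of the CART confusion matrix (rows: predictions, columns: observations)
cart.tab <- matrix(c(27, 72, 12, 189), nrow = 2,
                   dimnames = list(Predictions = c("bad", "good"),
                                   Observations = c("bad", "good")))

accuracy(cart.tab)   # (27 + 189) / 300
```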
We can now begin to work on the different models that we will use in order to make our predictions.
First, we begin our analysis by using a decision (classification) tree.
The goal of a decision tree is to predict the final class of our response variable (“good” or “bad”) by using a succession of binary rules to apply to our data.
Each node is created by an algorithm that aims to minimise an impurity criterion. The feature and the threshold value that maximise the impurity reduction (that “best split” the dataset in two) are selected. This procedure is repeated until a stopping rule is reached.
At the end, we have multiple branches that each lead to the final prediction made for a given instance.
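The impurity reduction can be illustrated with the Gini index, a common impurity criterion and the default one in rpart for classification. In the sketch below, the root node uses the class counts of the full data set, while the split itself (the counts in `left` and `right`) is hypothetical and chosen only for illustration.

```r
# Gini impurity of a node given its class counts
gini <- function(counts) {
  p <- counts / sum(counts)
  1 - sum(p^2)
}

# Root node with the class counts of the full data set: 300 bad, 700 good
root <- c(bad = 300, good = 700)

# Hypothetical binary split of the root (assumed counts, for illustration only)
left  <- c(bad = 200, good = 200)
right <- c(bad = 100, good = 500)

# Impurity reduction = parent impurity - weighted impurity of the two children
n <- sum(root)
reduction <- gini(root) -
  (sum(left) / n) * gini(left) -
  (sum(right) / n) * gini(right)

gini(root)   # impurity before the split
reduction    # the larger, the better the split
```

The tree-growing algorithm evaluates this quantity for every candidate split and keeps the one with the largest reduction.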
Better than words, let’s compute the model and have a look at it.
cart.model <- rpart(response ~ ., data = cred.train, method = "class")
rpart.plot(cart.model, main="Original decision tree")
–> At this point, we have obtained a decision tree that can be used to make predictions. At each node, one looks at the values of the instance and takes the appropriate direction until arriving at the last row, where the prediction is made.
However, this tree is also really complex: there are a lot of splits and branches.
We have to tune this initial model and make it simpler. We will simplify it without losing predictive capability by pruning it, keeping only the most important splits, linked to the most important variables.
As already said, our complex tree has to be pruned in order to reduce its complexity. To do so, we will use the 1-SE rule.
The idea of this rule is really general. Just as one would use a t-test in statistics to see whether two measures are statistically different, the 1-SE rule tries to establish whether two models produce statistically different results (whether we can affirm that one outperforms the other).
We therefore consider the xerror (the criterion we are interested in and want to minimise) and its standard error xstd. A model whose xerror falls within one standard error of the lowest xerror can be considered equivalent in terms of performance. Therefore, using this rule, we will be able to prune the tree to get a much simpler model without losing quality, as the performance of our new model will be statistically the same as that of the best one.
To select the size of our pruned tree, we take the lowest xerror of the table and add its standard error xstd. We then select the simplest tree whose xerror lies below the calculated value.
Let’s have a look at these values from our original tree and decide where to prune it.
| CP | nsplit | rel error | xerror | xstd |
|---|---|---|---|---|
| 0.0572139 | 0 | 1.0000000 | 1.0000000 | 0.0595529 |
| 0.0398010 | 2 | 0.8855721 | 0.9900498 | 0.0593745 |
| 0.0298507 | 3 | 0.8457711 | 0.9950249 | 0.0594641 |
| 0.0199005 | 4 | 0.8159204 | 0.9452736 | 0.0585352 |
| 0.0149254 | 8 | 0.7114428 | 0.9303483 | 0.0582417 |
| 0.0124378 | 11 | 0.6666667 | 0.9651741 | 0.0589157 |
| 0.0100000 | 13 | 0.6417910 | 0.9651741 | 0.0589157 |
| Variable | Importance |
|---|---|
| chk_acct | 30.415366 |
| duration | 24.535462 |
| amount | 16.384878 |
| history | 9.693259 |
| sav_acct | 9.544539 |
| employment | 6.126322 |
According to our first table, the minimal xerror is equal to 0.9303483, its xstd to 0.0582417, and the sum of both is therefore equal to 0.98859.
The smallest tree with an xerror below this value has an xerror of 0.9452736, and we will therefore prune the tree at CP = 0.0199005. According to our previous table, this represents 4 splits to be kept, equivalent to a tree of size 5.
It is also possible to observe this value on the graph, where the dashed line indicates the minimal xerror plus its xstd. We therefore select the least complex tree under this line, which is the tree of size 5 (or, again, with 4 splits).
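The selection described above can be reproduced from the table alone. The sketch below copies the values of the table into a data frame named `cptable` (a stand-in for `cart.model$cptable`, typed in by hand) and applies the 1-SE rule step by step.

```r
# Values copied from the CP table above
cptable <- data.frame(
  CP     = c(0.0572139, 0.0398010, 0.0298507, 0.0199005,
             0.0149254, 0.0124378, 0.0100000),
  nsplit = c(0, 2, 3, 4, 8, 11, 13),
  xerror = c(1.0000000, 0.9900498, 0.9950249, 0.9452736,
             0.9303483, 0.9651741, 0.9651741),
  xstd   = c(0.0595529, 0.0593745, 0.0594641, 0.0585352,
             0.0582417, 0.0589157, 0.0589157)
)

best      <- which.min(cptable$xerror)                   # row with the lowest xerror
threshold <- cptable$xerror[best] + cptable$xstd[best]   # lowest xerror + its xstd
chosen    <- min(which(cptable$xerror < threshold))      # simplest tree below it

threshold
cptable[chosen, c("CP", "nsplit")]
```

The rows of the table are ordered from the simplest to the most complex tree, which is why the first row below the threshold is the simplest acceptable tree.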
We thus prune the tree at this value (the R function requires the CP value to be given).
cart.pruned <- prune(cart.model, cp = cp.pruned)
We can visualize this new pruned tree that is much smaller and therefore less complex than our original one and will use it to do our predictions, to build the confusion matrix and to calculate the underlying accuracy.
pred.cart <- predict(cart.pruned, newdata = cred.test, type = "class")
# confusion matrix (rows: predictions, columns: observations):
cart.tab <- table(Predictions = pred.cart, Observations = cred.test$response)
cart.tab %>% kable(caption = "Confusion matrix for the CART model",
                   col.names = c("Observed bad", "Observed good")) %>%
  kable_styling(
    bootstrap_options = c("striped", "hover", "condensed"),
    full_width = F,
    position = "l") %>%
  column_spec(1, border_right = T, width = "5em") %>%
  column_spec(2, width = "6em") %>%
  column_spec(3, width = "6em")
| Predicted | Observed bad | Observed good |
|---|---|---|
| bad | 27 | 12 |
| good | 72 | 189 |
Based on this table, we can calculate the accuracy using our previously built function, which divides the number of well-classified elements by the total number of elements.
# accuracy using our previously built function
accuracy(cart.tab)
## [1] 0.72
It is possible to have more information, for instance the row and column totals, with the following cross table.
Cell Contents
|-------------------------|
| N |
| N / Row Total |
| N / Col Total |
| N / Table Total |
|-------------------------|
Total Observations in Table: 300
| pred.cart
cred.test$response | bad | good | Row Total |
-------------------|-----------|-----------|-----------|
bad | 27 | 72 | 99 |
| 0.273 | 0.727 | 0.330 |
| 0.692 | 0.276 | |
| 0.090 | 0.240 | |
-------------------|-----------|-----------|-----------|
good | 12 | 189 | 201 |
| 0.060 | 0.940 | 0.670 |
| 0.308 | 0.724 | |
| 0.040 | 0.630 | |
-------------------|-----------|-----------|-----------|
Column Total | 39 | 261 | 300 |
| 0.130 | 0.870 | |
-------------------|-----------|-----------|-----------|
We can see that 99 instances are actually bad and 201 are actually good, while 39 are predicted as bad and 261 as good.
–> Discuss the accuracy itself (0.72 is not great compared with the baseline of 0.7 obtained by predicting only “good”) –> Discuss the correct predictions of 0 (these are the ones we care about, and predicting only 1 would lead the bank to bankruptcy) –> Discuss FALSE POSITIVE / FALSE NEGATIVE / TRUE POSITIVE / TRUE NEGATIVE
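The quantities listed in this note can be read from the cross table above. The sketch below types the four counts in by hand and treats “bad” as the positive class, which is our own choice here since missing a bad credit is the costly error.

```r
# Counts from the cross table above (rows: observed, columns: predicted)
tab <- matrix(c(27, 12, 72, 189), nrow = 2,
              dimnames = list(observed = c("bad", "good"),
                              predicted = c("bad", "good")))

# "bad" is taken as the positive class (assumption of this sketch)
TP <- tab["bad", "bad"]     # bad credits predicted as bad
FN <- tab["bad", "good"]    # bad credits predicted as good
FP <- tab["good", "bad"]    # good credits predicted as bad
TN <- tab["good", "good"]   # good credits predicted as good

recall_bad    <- TP / (TP + FN)   # share of the bad credits that are detected
precision_bad <- TP / (TP + FP)   # share of the "bad" predictions that are correct
acc           <- (TP + TN) / sum(tab)

round(c(recall = recall_bad, precision = precision_bad, accuracy = acc), 3)
```

The recall and the precision obtained this way correspond to the row and column proportions of the first cell of the cross table.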
We present the characteristics of the retained model, with 15 neurons in the hidden layer and a decay of 1. As usual, we train the model and predict the values of the test set in order to build the confusion matrix and calculate the accuracy.
# best tuning parameters found by caret (row with the highest accuracy)
best.nnet <- nnet_fit$results[which.max(nnet_fit$results$Accuracy), ]
# the formula uses the column name so that `.` excludes the response from the predictors
nnet.model.retained <-
  nnet(response ~ .,
       data = cred.train, maxit = 200,
       size = best.nnet$size,
       decay = best.nnet$decay)
# Predictions on the test set:
pred.nnet.retained <- predict(nnet.model.retained, cred.test, type = "class")
# Confusion matrix
tab.nnet.retained <- table(Reality = cred.test$response,
                           Predicted = unlist(pred.nnet.retained))
# Accuracy
acc.nnet.retained <- sum(ifelse(
  cred.test$response == unlist(pred.nnet.retained), 1, 0), na.rm = TRUE) /
  length(cred.test$response)
In addition, we can plot our neural network:
This graph shows that our model can be quite complex: there are plenty of arrows. It is not necessary to understand all of them precisely; the network can be seen as a “black box”, meaning that the interpretability of such a model is limited. But, after all, what we need is to make good predictions!
| | Predict bad | Predict good |
|---|---|---|
| bad | 53 | 46 |
| good | 30 | 171 |
We obtain a final accuracy of 0.747 on our test set. Note that this accuracy differs from the one obtained earlier with cross-validation since here we use a single test set, whose class proportions may be unbalanced. At the end, we will again use cross-validation for all models, on the same training and test sets, in order to compare them.
# kknn_parameters <- expand.grid(kmax = 2:30, distance = 1:5, kernel = "optimal")
#
# kknn_fit <- train(form = response ~ .,
#                   data = cred,
#                   trControl = train_control,
#                   tuneGrid = kknn_parameters,
#                   method = "kknn",
#                   preProcess = c("center", "scale"))
# plot(kknn_fit)
# kknn_fit$results
#
# kknn_fit$results[which.max(kknn_fit$results$Accuracy),]$kmax
# kknn_fit$results[which.max(kknn_fit$results$Accuracy),]$distance
After having created multiple K-Nearest Neighbors models with the kknn function, we realise that a distance of 2 outperforms the other distances. We can therefore use the knn function to illustrate how our final model is built and how we select k (note that the knn function is more efficient in terms of computational power, which is the only reason why we use it to choose k rather than the kknn function just above).
We can thus use the knn function to see how the accuracy varies with different values of k between 2 and 30, and select the number of neighbors that leads to the highest accuracy.
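In kknn, the distance argument is the order of the Minkowski distance, so a distance of 2 corresponds to the usual Euclidean distance and a distance of 1 to the Manhattan distance. A minimal sketch with a hypothetical helper `minkowski()` (written for this illustration, not a function of kknn or caret):

```r
# Minkowski distance of order p between two numeric vectors
minkowski <- function(a, b, p) {
  sum(abs(a - b)^p)^(1 / p)
}

a <- c(0, 0)
b <- c(3, 4)

minkowski(a, b, p = 1)   # Manhattan distance: 3 + 4
minkowski(a, b, p = 2)   # Euclidean distance: sqrt(9 + 16)
```

Because such distances add up differences across variables, the predictors are centered and scaled before fitting, so that a variable with large values such as amount does not dominate the others.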
knn_parameters <- expand.grid(k = 2:30)
knn_fit <- train(form = response~ .,
data = cred,
trControl = train_control,
tuneGrid = knn_parameters,
method = "knn",
preProcess = c("center", "scale"))
plot(knn_fit)
knn_fit$results
## k Accuracy Kappa AccuracySD KappaSD
## 1 2 0.692 0.2462395 0.02683282 0.06888221
## 2 3 0.699 0.2441172 0.03286335 0.07808652
## 3 4 0.695 0.2196982 0.04138236 0.10429467
## 4 5 0.722 0.2639107 0.02387467 0.05968080
## 5 6 0.716 0.2446454 0.01816590 0.04808965
## 6 7 0.731 0.2744440 0.02534758 0.06783559
## 7 8 0.720 0.2452409 0.03221025 0.08911043
## 8 9 0.727 0.2605492 0.02413504 0.06231199
## 9 10 0.730 0.2655259 0.01457738 0.04736117
## 10 11 0.736 0.2709487 0.01294218 0.04068508
## 11 12 0.742 0.2884146 0.01483240 0.05555365
## 12 13 0.740 0.2704959 0.00500000 0.03427498
## 13 14 0.746 0.2892835 0.01083974 0.03449258
## 14 15 0.738 0.2597819 0.01680774 0.05662844
## 15 16 0.741 0.2658625 0.01140175 0.04409081
## 16 17 0.737 0.2526184 0.01151086 0.04869526
## 17 18 0.745 0.2791824 0.01541104 0.04902223
## 18 19 0.734 0.2302861 0.02133073 0.08055620
## 19 20 0.731 0.2263990 0.01635543 0.06719721
## 20 21 0.734 0.2228327 0.02434132 0.09057878
## 21 22 0.728 0.1997635 0.02079663 0.08504872
## 22 23 0.730 0.1999490 0.02423840 0.09447919
## 23 24 0.728 0.1968037 0.02109502 0.07736229
## 24 25 0.729 0.1979697 0.02247221 0.08991717
## 25 26 0.722 0.1761069 0.02252776 0.09150449
## 26 27 0.729 0.1963998 0.01917029 0.07845248
## 27 28 0.734 0.2076551 0.02724885 0.10442122
## 28 29 0.730 0.1871680 0.01369306 0.06472995
## 29 30 0.726 0.1735408 0.01596872 0.06421134
knn_fit$results[which.max(knn_fit$results$Accuracy),]$k
## [1] 14
The selected knn model uses a distance of 2 and is computed with the 14 nearest neighbors. The accuracy obtained using a 10-fold cross-validation is equal to 0.746.
| Predict bad | Predict good | |
|---|---|---|
| bad | 27 | 12 |
| good | 72 | 189 |
Given our confusion matrix, the accuracy obtained on our initial test set is equal to 0.72.
–> Explain why we do a cross-validation –>
We create 10 different sets in order to do the cross-validation. The number of instances being 1000, we make 10 sets of size 100. They are stored in a list named test.list. At each step, the remaining part of the data base is stored in the list train.list.
test.list <- list() # creates an empty list that will hold the test sets
train.list <- list() # creates an empty list that will hold the train sets
counter <- 0
# Creates the 10 test sets of size 100 (and the 10 matching train sets of size 900)
for (i in 1:10){
  index <- counter + c(1:100) # the row numbers that will be in the test set
  test.list[[i]] <- cred[index, ] # the test set number i
  train.list[[i]] <- cred[-index, ] # the train set number i
  counter <- counter + 100
}
For example, test.list[[1]] is a data set made of the first 100 rows of our data; the folds are consecutive blocks of rows, so they are only random if the rows of cred are themselves in random order.
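If the rows of the data were sorted in some way, the folds could instead be drawn at random. The sketch below is one possible way to do it, not the procedure used in this report: the names `fold_id` and `test.index` are introduced for illustration, and only row numbers are generated so that the example runs without the data.

```r
set.seed(1234)
n <- 1000   # number of instances, as in the report
k <- 10     # number of folds

# Assign each row to one of the k folds at random, with equal fold sizes
fold_id <- sample(rep(1:k, length.out = n))
test.index <- split(seq_len(n), fold_id)   # list of k vectors of row numbers

# With the data frame cred, fold i would then be used as:
#   test.set  <- cred[test.index[[i]], ]
#   train.set <- cred[-test.index[[i]], ]
lengths(test.index)
```

Each row number appears in exactly one test fold, so every instance is predicted once over the 10 iterations.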
# test.list[[1]] %>% kable(caption="Our first test set named test.list[[1]]") %>%
# kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
# full_width = F) %>%
# scroll_box(width = "100%", height = "200px")
#
#
# train.list[[1]] %>% kable(caption="Train set associated with test.list[[1]]") %>%
# kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
# full_width = F) %>%
# scroll_box(width = "100%", height = "200px")
acc.cart.cv <- numeric(10)
for (i in 1:10){
  # original tree, grown on the training part of fold i
  cart.cv <- rpart(response ~ ., data = train.list[[i]], method = "class")
  # 1-SE rule on the cptable of this tree: lowest xerror plus its xstd
  cpt <- cart.cv$cptable
  best <- which.min(cpt[, "xerror"])
  cp.pruned <- cpt[min(which(cpt[, "xerror"] <
                               cpt[best, "xerror"] + cpt[best, "xstd"])), "CP"]
  # the pruned tree for predictions
  cart.pruned.cv <- prune(cart.cv, cp = cp.pruned)
  # making the predictions
  cart.pred.cv <- predict(cart.pruned.cv, newdata = test.list[[i]], type = "class")
  # the confusion matrix
  tab.cart.cv <- table(test.list[[i]]$response, cart.pred.cv)
  # the final accuracy
  acc.cart.cv[i] <- accuracy(tab.cart.cv)
}
acc.cart.cv
## [1] 0.75 0.68 0.77 0.72 0.72 0.59 0.70 0.68 0.71 0.68
mean(acc.cart.cv)
## [1] 0.7
sd(acc.cart.cv)
## [1] 0.04898979
# train_control <- trainControl(method = "cv", number = 10)
#
# knn_parameters <- data.frame (k = 3:30)
#
# knn_fit <- train(form = response~ .,
# data = cred,
# trControl = train_control,
# tuneGrid = knn_parameters,
# method = "knn",
# preProcess = c("center", "scale"))
# plot(knn_fit)
# knn_fit$results
train_control <- trainControl(method = "cv", number = 10)
nnet_fit <- train(form = response~ .,
data = cred,
trControl = train_control,
tuneGrid = expand.grid(size = 10:20, decay = 1:5),
method = "nnet")
plot(nnet_fit)
nnet_fit
nnet_fit$results